SQL Server 2008 : Replication - Subscriptions

8/15/2011 11:42:48 AM

A subscription is essentially a formal request and registration of that request for data that is being published. By definition, you subscribe to all articles of a publication.

When a subscription is being set up, you have the option of either having the data “pushed” to the subscriber server or “pulling” the data to the subscription server when it is needed. This is referred to as either a push subscription or pull subscription.

As shown in Figure 1 , a pull subscription is set up and managed by the subscription server. The biggest advantage here is that pull subscriptions allow the system administrators of the subscription servers to choose what publications they will receive and when they receive them. With pull subscriptions, publishing and subscribing are separate acts and are not necessarily performed by the same user. In general, pull subscriptions are best when the publication does not require high security or if subscribing is done intermittently when the subscriber’s data needs to be periodically brought up to date.

Figure 1. Push or pull subscriptions.

As you can also see in Figure 19.17 , a push subscription is created and managed by the publication and distribution server. In effect, the distribution server and all the agents that do the work are pushing the publication to the subscription server. The advantage of using push subscriptions is that all the administration takes place in a central location (on the publication/distribution server side). In addition, publishing and subscribing happen at the same time, and many subscribers can be set up at once. This type of subscription is also recommended when dealing with heterogeneous subscribers because of the lack of pull capability on the subscription server side.

Anonymous Subscriptions (Pull Subscriptions)

It is possible to have “anonymous” subscriptions. An anonymous subscription is a special type of pull subscription that can be used in the following circumstances:

When you are publishing data to the Internet
When you have a huge number of subscribers
When you don’t want the overhead of maintaining extra information at the publisher or distributor
When all the rules of your pull subscriptions apply to all your anonymous subscribers

Normally, information about all the subscribers, including performance data, is stored on the distribution server. Therefore, if you have a large number of subscribers or you do not want to track detailed information about the subscribers, you might want to allow anonymous subscriptions to a publication. Then little is kept at the distribution server, but it then becomes the responsibility of the subscriber to initiate the subscription and to keep synchronized.

The Distribution Database

The distribution database is a special type of database installed on the distribution server. This database, which is as a store-and-forward database, holds all transactions waiting to be distributed to any subscribers. This database receives transactions from any published databases that have designated it as their distributor. The transactions are held here until they are sent to the subscribers successfully. After a period of time, these transactions are purged from the distribution database. In some special situations, the transactions might not be purged for a longer period, enabling anonymous subscribers ample time to synchronize. The distribution database is the heart of the data replication facility. As you can see in Figure 2 , the distribution database has several MS tables, such as MSarticles. These tables contain all the necessary information for the distribution server to fulfill the distribution role. Following are some of these tables:

Figure 2. Tables of the distribution database and the distribution agents.

All the different publishers who will use this distribution server— Stored in the MSpublisher_databases and MSpublication_access tables.
The publications and articles that will be distributed— Stored in the MSpublications and MSarticles tables.
The complete information for all the distribution agents to perform their tasks— Stored in the MSdistribution_agents table.
The complete information of the executions of these agents— Stored in the MSdistribution_history table.
The subscribers— Stored in MSsubscriber_info, MSsubscriptions, and other related tables.
Any errors that occur during replication and synchronization states— Stored in MSrepl_errors, MSsync_state, and related tables.
The actual commands and transactions that are to be replicated— Stored in the MSrepl_commands and MSrepl_transactions tables.
Heterogeneous (non-SQL Server) publishers’ or subscribers’ information— Kept in the tables whose names begin with IH, such as IHpublishers, that will contain one row for each non-SQL Server publisher for which this distribution server distributes information.